Abstract



A novel artificial neural network approach to constraint satisfaction problems is
presented. Based on information-theoretical considerations, it differs from a conventional
mean-field approach in the form of the resulting free energy. The method, implemented as
an annealing algorithm, is numerically explored on a testbed of K-SAT problems. The
performance shows a dramatic improvement over that of a conventional mean-field
approach, and is comparable to that of a state-of-the-art dedicated heuristic (Gsat+Walk).
The real strength of the method, however, lies in its generality - with minor modifications
it is applicable to arbitrary types of discrete constraint satisfaction problems.
1 Introduction



In the context of difficult optimization problems, artificial neural networks (ANN) based
on the mean-field approximation provide a powerful and versatile alternative to
problem-specific heuristic methods, and have been successfully applied to a number of
different problem types (Hopfield and Tank 1985; Peterson and Soderberg 1998).



In this paper, an alternative ANN approach to combinatorial constraint satisfaction 
problems (CSP) is presented. It is derived from a very general information-theoretical 
idea, which leads to a modified cost function as compared to the conventional mean-field 
based neural approach. 



A particular class of binary CSP that has attracted recent attention is K-SAT
(Papadimitriou 1994; Du et al. 1997); many combinatorial optimization problems can be
cast in K-SAT form. We will demonstrate in detail how to apply the information-based
ANN approach, to be referred to as INN, to K-SAT as a modified mean-field annealing
algorithm.

The method is evaluated by means of extensive numerical explorations on suitable
testbeds of random K-SAT instances. The resulting performance shows a substantial
improvement as compared to that of the conventional ANN approach, and is comparable
to that of a good dedicated heuristic - Gsat+Walk (Selman et al. 1994; Gu et al. 1997).



The real strength of the INN approach lies in its generality - the basic idea can easily 
be applied to arbitrary types of constraint satisfaction problems, not necessarily binary. 



2 K-SAT 



A CSP amounts to determining whether a given set of simple constraints over a set of 
discrete variables can be simultaneously fulfilled. 

Most heuristic approaches to a CSP attempt to find a solution, i.e. an assignment 
of values to the variables consistent with the constraints, and are hence incomplete in 
the sense that they cannot prove unsatisfiability. If the heuristic succeeds in finding a 
solution, satisfiability is proven; a failure, however, does not imply unsatisfiability. 

A commonly studied class of binary CSP is K-SAT. A K-SAT instance is defined as
follows: For a set of N Boolean variables $x_i$, determine whether an assignment can be
found such that a given Boolean function U evaluates to True, where U has the form

$$U = (a_{11} \,\mathrm{OR}\, a_{12} \,\mathrm{OR} \ldots \mathrm{OR}\, a_{1K}) \;\mathrm{AND}\; (a_{21} \,\mathrm{OR} \ldots \mathrm{OR}\, a_{2K}) \;\mathrm{AND} \ldots \mathrm{AND}\; (a_{M1} \,\mathrm{OR} \ldots \mathrm{OR}\, a_{MK}) , \tag{1}$$

i.e. U is the Boolean conjunction of M clauses, indexed by m = 1 ... M, each defined as
the Boolean disjunction of K simple statements (literals) $a_{mk}$, k = 1 ... K. Each literal
represents one of the elementary Boolean variables $x_i$ or its negation $\neg x_i$.

For K = 2 we have a 2-SAT problem; for K = 3 a 3-SAT problem, etc. If the clauses are
not restricted to have equal length the problem is referred to as a satisfiability problem
(SAT). There is a fundamental difference between K-SAT problems for different values
of K. While a 2-SAT instance can be exactly solved in a time polynomial in N, K-SAT
with $K \geq 3$ is NP-complete. Every K-SAT instance with K > 3 can be transformed
in polynomial time into a 3-SAT instance (Papadimitriou 1994). In this paper we will
focus on 3-SAT.
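To make the definition concrete, the following minimal sketch stores a small 3-SAT instance and checks assignments against it. The encoding (a signed integer +i for $x_i$ and -i for its negation) and all function names are assumptions chosen for illustration; the exhaustive search is only usable for very small N.

```python
from itertools import product

def clause_satisfied(clause, assignment):
    """A clause is a disjunction: it holds if any of its literals is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def formula_satisfied(clauses, assignment):
    """U is a conjunction: every clause must hold."""
    return all(clause_satisfied(c, assignment) for c in clauses)

def brute_force_solve(clauses, n_vars):
    """Try all 2^N assignments; return the first solution or None."""
    for values in product([False, True], repeat=n_vars):
        assignment = {i + 1: v for i, v in enumerate(values)}
        if formula_satisfied(clauses, assignment):
            return assignment
    return None

# (x1 OR x2 OR NOT x3) AND (NOT x1 OR x2 OR x3) AND (x1 OR NOT x2 OR NOT x3)
example = [[1, 2, -3], [-1, 2, 3], [1, -2, -3]]
solution = brute_force_solve(example, 3)
```

A complete method like this one can prove unsatisfiability (it returns `None` only after exhausting all assignments), in contrast to the incomplete heuristics discussed in this paper.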



3 Conventional ANN Approach 
3.1 ANN Approach to CSP in General 

In order to apply the conventional mean-field based ANN approach as a heuristic to a
Boolean CSP, the latter is encoded in terms of a non-negative cost function
H(s) of a set of N binary (±1) spin variables, $s = \{s_i,\ i = 1, \ldots, N\}$, such
that a solution corresponds to a combination of spin values that makes the cost function
vanish.

The cost function can be extended to continuous arguments in a unique way, by
demanding that it be a multi-linear polynomial in the spins (i.e. containing no squared
spins). Assuming a multi-linear cost function H(s), one considers mean-field variables
(or neurons) $v_i \in [-1, 1]$, approximating the thermal spin averages $\langle s_i \rangle$ in a Boltzmann
distribution $P(s) \propto \exp(-H(s)/T)$. They are defined by the mean-field equations

$$v_i = \tanh(u_i / T) , \tag{2}$$

$$u_i = -\frac{\partial H(v)}{\partial v_i} , \tag{3}$$

where $u_i$ is referred to as the local field for spin i. Here, T is an artificial temperature
and v denotes the collection of mean-field variables.

The mean-field equations are the conditions for a local minimum of the mean-field
free energy

$$F(v) = H(v) - T S(v) , \tag{4}$$

where S(v) is the spin entropy,

$$S(v) = -\sum_i \frac{1+v_i}{2} \log\frac{1+v_i}{2} - \sum_i \frac{1-v_i}{2} \log\frac{1-v_i}{2} . \tag{5}$$



The conventional ANN algorithm consists in solving the mean-field equations (2, 3)
iteratively, combined with annealing in the temperature. A typical algorithm is described
in figure 1.



• Initiate the mean-field spins $v_i$ to random values close to zero, and T to a high
  value.

• Repeat the following (a sweep), until the mean-field variables have saturated
  (i.e. become close to ±1):

  - For each spin, calculate its local field from (3), and update the spin
    according to (2).

  - Decrease T slightly (typically by a few percent).

• Extract the resulting solution candidate, using $s_i = \mathrm{sign}(v_i)$.



Figure 1: A mean-field annealing ANN algorithm. 



3.2 Application to K-SAT 



When applying the ANN approach to K-SAT, the Boolean variables are encoded using
±1-valued spin variables $s_i$, i = 1 ... N, with $s_i = +1$ representing True, and $s_i = -1$
False. In terms of the spins, a suitable multi-linear cost function H(s) is given by the
following expression,

$$H(s) = \sum_{m=1}^{M} \prod_{i \in \mathcal{M}_m} \frac{1}{2}\left(1 - C_{mi} s_i\right) , \tag{6}$$

where $\mathcal{M}_m$ denotes the set of spins involved in the mth clause. H(s) evaluates to the
number of broken clauses, and vanishes iff s represents a solution. The M × N matrix
C defines the K-SAT instance: An element $C_{mi}$ equals +1 (or −1) if the mth clause
contains the ith Boolean variable as is (or negated); otherwise $C_{mi} = 0$.

The cost function defines a problem-specific set of mean-field equations, (2, 3), in
terms of mean-field variables $v_i \in [-1, 1]$. In the mean-field annealing approach (figure
1), the temperature T is initiated at a high value, and then slowly decreased (annealing),
while a solution to (2, 3) is tracked iteratively. At high temperatures there will be a stable
fixed point with all neurons close to zero, while at a low temperature they will approach
±1 (the neurons have saturated) and an assignment can be extracted.

For the K-SAT cost function (6) the local field $u_i$ in (3) is given by

$$u_i = \sum_{m} \frac{1}{2} C_{mi} \prod_{j \in \mathcal{M}_m,\, j \neq i} \frac{1}{2}\left(1 - C_{mj} v_j\right) , \tag{7}$$

which, due to the multi-linearity of H, does not depend on $v_i$; this lack of self-coupling
is beneficial for the stability of the dynamics.
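The annealing scheme of figure 1, applied to the cost function (6) with the local field (7), can be sketched as follows. This is a minimal illustration rather than the implementation used for the experiments: the clause representation (a list of `(i, C_mi)` pairs with 0-based variable indices), the function names and the default parameters are assumptions made here, and convergence checks within each temperature are left out.

```python
import math
import random

def cost(clauses, spins):
    """Eq. (6): number of broken clauses (all spins equal to -C_mi)."""
    return sum(all(spins[i] == -c for i, c in clause) for clause in clauses)

def index_clauses(clauses, n):
    """For each variable, list the clauses it appears in."""
    clauses_of = [[] for _ in range(n)]
    for clause in clauses:
        for i, _ in clause:
            clauses_of[i].append(clause)
    return clauses_of

def local_field(i, v, clauses_of):
    """Eq. (7); assumes each variable occurs at most once per clause."""
    u = 0.0
    for clause in clauses_of[i]:
        term = 1.0
        for j, c in clause:
            # Factor C_mi/2 for spin i itself, (1 - C_mj v_j)/2 for the others.
            term *= 0.5 * c if j == i else 0.5 * (1.0 - c * v[j])
        u += term
    return u

def mean_field_anneal(clauses, n, t_start=3.0, t_end=0.1, rate=0.99, seed=0):
    """One sweep per temperature, then T is lowered by the factor `rate`."""
    rng = random.Random(seed)
    clauses_of = index_clauses(clauses, n)
    v = [rng.uniform(-0.01, 0.01) for _ in range(n)]
    t = t_start
    while t > t_end:
        for i in range(n):
            v[i] = math.tanh(local_field(i, v, clauses_of) / t)  # eq. (2)
        t *= rate
    return [1 if x >= 0 else -1 for x in v]
```

Because the local field of spin i never reads `v[i]`, the spins can be updated in place during a sweep without self-coupling effects.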

4 Information-Based ANN Approach: INN 

4.1 The Basic Idea 

For problems of the CSP type, we suggest an information-based neural network approach, 
based on the idea of balance of information, considering the variables as sources of 
information, and the constraints as consumers thereof. 

This suggests constructing an objective function (or free energy) F of the general form

$$F = \mathrm{const} \times (\text{information demand}) - \mathrm{const} \times (\text{available information}) , \tag{8}$$

that is to be minimized. The meaning of the two terms can be made precise in a
mean-field-like setting, where a factorized artificial Boltzmann distribution is assumed, with
each Boolean variable having an independent probability to be assigned the value True.
We will give a detailed derivation below for K-SAT. Other problem types can be treated
in an analogous way. We will refer to this type of approach as INN.



4.2 INN Approach to K-SAT

Here we describe in detail how to apply the general ideas above to the specific case of
K-SAT.

The average information resource residing in a spin is given by its entropy,

$$S(s_i) = -P_{s_i=1} \log P_{s_i=1} - P_{s_i=-1} \log P_{s_i=-1} , \tag{9}$$

where P are probabilities. If the spin is completely random, $P_{s_i=1} = P_{s_i=-1} = \frac{1}{2}$ and
$S(s_i) = \log(2)$, representing an unused resource of one bit of information. If the spin is
set to a definite value ($s_i = \pm 1$), no more information is available and $S(s_i) = 0$.

For a clause the interesting property is the expected amount of information needed to
satisfy it. For the mth clause, this can be estimated as

$$I_m = -\log P_m^{\mathrm{sat}} = -\log\left(1 - P_m^{\mathrm{unsat}}\right) , \tag{10}$$

in terms of the probability $P_m^{\mathrm{sat}}$ for the clause to be satisfied in a given probability
distribution for the spins.

Of the $2^K$ distinct states available to the K spins appearing in the clause, only one
corresponds to the clause being unsatisfied. Then, for a totally undetermined clause (all
K spins having random values), we have $P_m^{\mathrm{unsat}} = 2^{-K}$, yielding $I_m = -\log\left(1 - 2^{-K}\right)$.
For a definitely satisfied clause, on the other hand, we must have $P_m^{\mathrm{unsat}} = 0$, giving
$I_m = 0$. Finally, a broken clause corresponds to $P_m^{\mathrm{unsat}} = 1$, leading to $I_m \to \infty$.

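Assuming a mean-field-like probability distribution, with each spin obeying independent
probabilities

$$P_{s_i = \pm 1} = \frac{1 \pm v_i}{2} , \tag{11}$$

in terms of mean-field variables $v_i = \langle s_i \rangle \in [-1, 1]$, the probabilities used above for the
clauses become

$$P_m^{\mathrm{unsat}} = \prod_{i \in \mathcal{M}_m} \frac{1}{2}\left(1 - C_{mi} v_i\right) . \tag{12}$$

The unused spin information is given by the entropy S of the spins (eq. (5)) and the
information I needed by the clauses is

$$I(v) = -\sum_{m=1}^{M} \log\left(1 - \prod_{i \in \mathcal{M}_m} \frac{1}{2}\left(1 - C_{mi} v_i\right)\right) . \tag{13}$$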



We now have the necessary prerequisites to define an information-based free energy,
which we choose as $F(v) = I(v) - T S(v)$ (in analogy with ANN), which is to be
minimized. Demanding that F have a local minimum with respect to the mean-field variables
yields equations similar to the mean-field equations (2, 3), but with H(v) replaced by
I(v):

$$u_i = -\frac{\partial I}{\partial v_i} . \tag{14}$$

Note that for discrete arguments, $v_i = \pm 1$, the information demand I will be infinite for
any non-solving assignment.
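The quantities entering the information-based free energy can be evaluated directly, which makes the limiting cases discussed above easy to check numerically. The sketch below is illustrative only; the clause representation (a list of `(i, C_mi)` pairs with 0-based indices) and the function names are assumptions, and natural logarithms are used.

```python
import math

def clause_unsat_prob(clause, v):
    """Probability that the clause is unsatisfied under factorized spins."""
    p = 1.0
    for i, c in clause:
        p *= 0.5 * (1.0 - c * v[i])
    return p

def clause_information(clause, v):
    """Information demand of one clause; infinite for a broken clause."""
    p = clause_unsat_prob(clause, v)
    return math.inf if p >= 1.0 else -math.log(1.0 - p)

def spin_entropy(v_i):
    """Entropy of one spin with P(s = +1) = (1 + v_i)/2 and P(s = -1) = (1 - v_i)/2."""
    s = 0.0
    for p in (0.5 * (1.0 + v_i), 0.5 * (1.0 - v_i)):
        if p > 0.0:  # the limit p log p -> 0 as p -> 0
            s -= p * math.log(p)
    return s

def free_energy(clauses, v, t):
    """F(v) = I(v) - T S(v)."""
    info = sum(clause_information(c, v) for c in clauses)
    return info - t * sum(spin_entropy(x) for x in v)

clause = [(0, 1), (1, 1), (2, 1)]                       # x0 OR x1 OR x2
undetermined = clause_information(clause, [0.0, 0.0, 0.0])  # -log(1 - 2^-3)
satisfied = clause_information(clause, [1.0, 0.0, 0.0])     # 0
broken = clause_information(clause, [-1.0, -1.0, -1.0])     # infinite
```

For K = 3 the undetermined clause demands $\log(8/7) \approx 0.134$ nats, small compared with the $\log 2$ available in each fully random spin.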
4.3 Algorithmic Details



Based on the analysis above, we propose an information-based annealing algorithm
similar to mean-field annealing, but with the multi-linear cost function H (6) replaced by
the clause information I (13).

Note that the contribution $I_m$ to I from a single clause m is a simple function of the
corresponding contribution $H_m$ to H,

$$I_m = -\log\left(1 - H_m\right) . \tag{15}$$

As a result, the effective cost function I is not multilinear, and measures have to be
taken to ensure stability of the dynamics. The resulting self-couplings can be avoided
by using, instead of the derivative in (14), the difference

$$u_i = -\frac{1}{2}\left( I|_{v_i=1} - I|_{v_i=-1} \right) , \tag{16}$$

which coincides with the derivative for a multilinear I (Ohlsson et al. 1993).
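As a short worked step, the difference in (16) can be evaluated explicitly for the clause information (13). The shorthand $q_{mi}$ is introduced here only for this illustration and is not part of the original notation.

```latex
\begin{align*}
% A function linear in v_i has a difference equal to its derivative:
f(v_i) = a + b\,v_i \;&\Rightarrow\; \tfrac{1}{2}\bigl(f(1) - f(-1)\bigr) = b = \frac{\partial f}{\partial v_i} . \\
% For a clause m containing spin i, collect the other factors:
q_{mi} &= \prod_{j \in \mathcal{M}_m,\, j \neq i} \tfrac{1}{2}\left(1 - C_{mj} v_j\right) , \\
I_m\big|_{v_i = C_{mi}} &= 0 , \qquad I_m\big|_{v_i = -C_{mi}} = -\log\left(1 - q_{mi}\right) , \\
% Inserting into (16), summing over the clauses that contain spin i:
u_i &= -\frac{1}{2} \sum_{m \ni i} C_{mi} \log\left(1 - q_{mi}\right) .
\end{align*}
```

For small $q_{mi}$, $-\log(1 - q_{mi}) \approx q_{mi}$ and this local field reduces to the conventional one in (7); it diverges as $q_{mi} \to 1$, i.e. when all other spins in the clause have the wrong sign.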

The resulting INN annealing algorithm is summarized in figure 2.

1. Choose a suitable high initial temperature T, such that the equilibrium neurons
are close to zero.

2. Do a sweep: Update all neurons according to (2, 16).

3. Lower the temperature T by a fixed factor.

4. If the stop-criteria are not met, repeat from 2.

5. Extract a solution by means of $s_i = \mathrm{sign}(v_i)$.

A typical value of the annealing factor is 0.95 - 0.99, and suitable stop-criteria are that
all neurons are either saturated ($|v_i| > 0.99$) or redundant ($|v_i| < 0.01$).

Figure 2: The INN annealing algorithm for K-SAT.

At high temperatures, information is expensive, and the neurons stay fuzzy, $v_i \approx 0$. As
T is decreased, information becomes cheaper and the more useful neurons begin to
saturate. As $T \to 0$, all neurons are eventually forced to saturate, yielding a definite spin
state, $s_i = \pm 1$.



5 Numerical Explorations 
5.1 Testbeds



For performance investigations, we have considered two distinct testbeds. One consists
of uniform random K-SAT problems with N and $\alpha = M/N$ fixed (Cook and Mitchell
1997). For every problem instance, each of the M clauses is independently generated by
choosing at random a set of K distinct variables (among the N available). Each selected
variable is negated with probability 1/2.
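The generation procedure for the uniform random testbed can be sketched in a few lines. The function name, the 0-based `(i, C_mi)` clause representation and the rounding of $M = \alpha N$ to the nearest integer are assumptions made for this illustration.

```python
import random

def random_ksat(n, alpha, k=3, seed=None):
    """Uniform random K-SAT: M = alpha * N clauses over N variables."""
    rng = random.Random(seed)
    m = round(alpha * n)
    instance = []
    for _ in range(m):
        # K distinct variables, each negated with probability 1/2.
        variables = rng.sample(range(n), k)
        instance.append([(i, rng.choice((-1, 1))) for i in variables])
    return instance

instance = random_ksat(100, 4.3, seed=1)
```

Passing a fixed `seed` makes an instance reproducible, which is convenient when several algorithms are to be compared on identical problems.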

For this ensemble of problems, the fraction of unsatisfiable problems increases with the
parameter $\alpha$. In the thermodynamic limit ($N \to \infty$) there is a sharp satisfiability
transition at a K-dependent critical $\alpha$-value $\alpha_c^{(K)}$ (Hogg et al. 1996; Monasson et al.
1999). For problems where $\alpha < \alpha_c^{(K)}$ almost all generated problems are satisfiable and
for $\alpha > \alpha_c^{(K)}$ almost all are unsatisfiable. For 3-SAT, $\alpha_c \approx 4.25$ (Cook and Mitchell 1997;
Monasson et al. 1999).



We have used a set of N-values between 100 and 2000, and for each N a set of $\alpha$-values
between 3.7 and 4.3. For each N and $\alpha$, 200 problem instances are generated.



In addition, testbeds consisting purely of satisfiable instances are useful to gauge the 
efficiency of a heuristic. Such a testbed can be generated by filtering out unsatisfiable 
instances (using a complete (exact) algorithm) from the uniform random distribution 
described above. 

For a second testbed, we have collected a set of instances of this type from SATLIB
(footnote 3), consisting of satisfiable random problems for different N between 20 and
250, with $\alpha$ fixed close to $\alpha_c$. For natural reasons, this testbed does not include very
large N.



5.2 Comparison Algorithms 



To gauge the performance of the INN algorithm, we have in addition to the conventional
ANN algorithm also applied a state-of-the-art dedicated heuristic to our testbeds. A
wealth of algorithms has been tested on SAT problems. For a survey, see e.g. (Gu et al.
1997). A local search method proven to be competitive is the gsat+walk algorithm, which
we will use as a second reference algorithm.



Gsat+walk starts with a random assignment and then uses two types of local moves
to proceed. A local move consists in flipping the state of a single variable between
True/False. The first type of move is greedy; the flip that increases the number of
satisfied clauses the most is chosen. The second type of move is a restricted random
walk move. A clause among those that are unsatisfied is chosen at random, and then a
randomly chosen variable in this clause is flipped.

3 http://www.informatik.tu-darmstadt.de/AI/SATLIB



5.3 Implementation details



In order to have a fair comparison of performances, we have chosen the parameter values
such that the three algorithms use approximately equal CPU time for each problem size.



5.3.1 ANN 



For ANN a preliminary initial temperature of 3.0 is used, which is dynamically adjusted
upwards until the neurons are close to zero ($\sum_i v_i^2 < 0.1N$), in order to ensure a start
close to the high-T fixed point.

The annealing rate is set to 0.99. At each temperature up to 10 sweeps are allowed in
order for the neurons to converge, as signalled by the maximal change in value for a
single neuron being less than 0.01. At every tenth temperature value, the cost function
is evaluated using the signs of the mean-field variables, $s_i = \mathrm{sign}(v_i)$; if this vanishes,
a solution is found and the algorithm exits. If no solution has been found when the
temperature reaches a certain lower bound (set to 0.1), the algorithm also exits; at that
temperature, most neurons typically will have stabilized close to ±1 (or occasionally 0).
Neurons that wind up at zero are those that are not needed at all or equally needed as
±1.



5.3.2 INN 



For the INN approach, the same temperature parameters as in ANN are used except for
the low T bound, which is set to 0.5. Because of the divergent nature of the cost function
I (13) and the local field $u_i$, extra precaution has to be taken when updating the
neurons - infinities appear when all the neurons in a clause are ±1 with the wrong
sign: $v_i = -C_{mi}$. When calculating $u_i$, the infinite clause contributions are counted
separately. If the positive infinities are more numerous than the negative ones, $v_i$ is set
to +1; if they are less numerous, $v_i$ is set to −1. If infinities exist but in equal numbers,
$v_i$ is randomly set to ±1; otherwise the finite part of $u_i$ is used.



This introduces randomness in the low temperature region if a solution has not been
found; the algorithm then acquires a local search behaviour, increasing its ability to find
a solution. In this mode the neurons do not change smoothly and the maximum number
of updates per temperature sweep (set to 10) is frequently used, which explains why INN
needs more time than the conventional ANN for difficult problem instances. Performance
can be improved, at the cost of increasing the CPU time used, with a slower annealing
rate and/or a lower low-T bound. Restarts of the algorithm also improve performance.
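The update rule with separately counted infinities can be sketched as below. This is a simplified illustration under stated assumptions, not the code used for the experiments: clauses are lists of `(i, C_mi)` pairs with 0-based indices, all names are hypothetical, a single sweep is made per temperature, the solution check is done after every sweep, and the dynamic adjustment of the initial temperature is omitted.

```python
import math
import random

def index_clauses(clauses, n):
    """For each variable, list the clauses it appears in."""
    clauses_of = [[] for _ in range(n)]
    for clause in clauses:
        for i, _ in clause:
            clauses_of[i].append(clause)
    return clauses_of

def num_broken(clauses, spins):
    """A clause is broken when every spin in it equals -C_mi."""
    return sum(all(spins[i] == -c for i, c in clause) for clause in clauses)

def inn_update(i, v, clauses_of, t, rng):
    """New v_i from the difference-based local field, infinities counted apart."""
    finite, n_pos, n_neg = 0.0, 0, 0
    for clause in clauses_of[i]:
        q, c_i = 1.0, 0
        for j, c in clause:
            if j == i:
                c_i = c
            else:
                q *= 0.5 * (1.0 - c * v[j])
        if q >= 1.0:
            # All other spins are wrong: infinite pull of v_i towards c_i.
            if c_i > 0:
                n_pos += 1
            else:
                n_neg += 1
        else:
            finite -= 0.5 * c_i * math.log(1.0 - q)
    if n_pos > n_neg:
        return 1.0
    if n_neg > n_pos:
        return -1.0
    if n_pos > 0:  # infinities present but balanced
        return rng.choice((-1.0, 1.0))
    return math.tanh(finite / t)

def inn_anneal(clauses, n, t_start=3.0, t_end=0.5, rate=0.99, seed=0):
    """Anneal from t_start down to the low-T bound; None if no solution is found."""
    rng = random.Random(seed)
    clauses_of = index_clauses(clauses, n)
    v = [rng.uniform(-0.01, 0.01) for _ in range(n)]
    t = t_start
    while t > t_end:
        for i in range(n):
            v[i] = inn_update(i, v, clauses_of, t, rng)
        spins = [1 if x >= 0 else -1 for x in v]
        if num_broken(clauses, spins) == 0:
            return spins
        t *= rate
    return None
```

Returning `None` on failure leaves room for the restarts mentioned above: a caller can simply retry with a different `seed`.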



5.3.3 gsat+walk 



The source code for gsat+walk can be found at SATLIB (footnote 4). We have attempted to
follow the recommendations in the enclosed documentation for parameter settings. The
probability at each flip of choosing a greedy move instead of a restricted random walk move
is set to 0.5. We have chosen to use a single run with 200 × N flips per problem, instead
of several runs with fewer flips per try, since this appears to improve overall performance.
Making several runs or using more flips per run will improve performance at the cost of
an increased CPU consumption.
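For orientation, the two move types and the parameter choices above (greedy probability 0.5, a single run of 200 × N flips) can be sketched as follows. This is a deliberately naive illustration and not the SATLIB implementation: the names and the `(i, C_mi)` clause representation are assumptions, ties between equally good greedy flips are broken at random, and the cost is recomputed from scratch for every candidate flip.

```python
import random

def num_broken(clauses, assign):
    """A clause is broken when every spin in it equals -C_mi."""
    return sum(all(assign[i] == -c for i, c in clause) for clause in clauses)

def gsat_walk(clauses, n, max_flips=None, p_greedy=0.5, seed=0):
    """Single run of greedy / random-walk flips; None if no solution is found."""
    rng = random.Random(seed)
    assign = [rng.choice((-1, 1)) for _ in range(n)]
    max_flips = 200 * n if max_flips is None else max_flips
    for _ in range(max_flips):
        broken = [c for c in clauses if all(assign[i] == -s for i, s in c)]
        if not broken:
            return assign
        if rng.random() < p_greedy:
            # Greedy move: flip the variable leaving the fewest broken clauses.
            best, best_cost = [], None
            for i in range(n):
                assign[i] = -assign[i]
                c = num_broken(clauses, assign)
                assign[i] = -assign[i]
                if best_cost is None or c < best_cost:
                    best, best_cost = [i], c
                elif c == best_cost:
                    best.append(i)
            flip = rng.choice(best)
        else:
            # Restricted random walk: random variable from a random broken clause.
            flip = rng.choice(rng.choice(broken))[0]
        assign[flip] = -assign[flip]
    return assign if num_broken(clauses, assign) == 0 else None
```

A practical implementation would maintain the change in cost for each candidate flip incrementally instead of rescanning all clauses.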



5.4 Results 



Here follow the results from our numerical investigations for the two testbeds. All
explorations have been made on a 600 MHz AMD Athlon computer running Linux.

The results from INN, ANN and Gsat+Walk for the uniform random testbed are
summarized in figures 3, 4, and 5.

In figure 3, the fraction of the problems not satisfied by the separate algorithms ($f_U$)
is shown as a function of $\alpha$ for different problem sizes N. The three algorithms show
different transitions in $\alpha$ above which they fail to find solutions. For INN and gsat+walk
the transition appears slightly beneath the real $\alpha_c$, while for ANN the transition is
situated below $\alpha = 3.7$.

The average number of unsatisfied clauses per problem instance (H) is presented in figure
4 for the three algorithms. H is shown as a function of $\alpha$ for different N. This can be
used as a performance measure also when an algorithm fails to find solutions (footnote 5).

The average CPU-time consumption (t) is shown in figure 5 for all algorithms. The
CPU-time is presented as a function of N for different $\alpha$ in order to show how the algorithms
scale with problem size.

The results ($f_U$, H, t) for the solvable testbed for all three algorithms are summarized
in table 1.

4 http://www.informatik.tu-darmstadt.de/AI/SATLIB

5 Finding a maximal number of satisfied clauses for a SAT instance is referred to as MAXSAT (Papadimitriou 1994).

Figure 3: Fraction unsatisfied problems ($f_U$) versus $\alpha$ for ANN (A), INN (B) and gsat+walk (C), for
N = 100 (+), 500 (x), 1000 (*), 1500 (□) and 2000 (■). The fractions are calculated from 200 instances;
the error in each point is less than 0.035.

Figure 4: Number of unsatisfied clauses H per instance versus $\alpha$, for ANN (A), INN (B) and gsat+walk
(C), for N = 100 (+), 500 (x), 1000 (*), 1500 (□) and 2000 (■). Average over 200 instances.



Figure 5: Log-log plot of used CPU-time t (given in seconds) versus N, for ANN (A), INN (B) and
gsat+walk (C), for $\alpha$ = 3.7 (+), 3.9 (x), 4.1 (*) and 4.3 (□). N ranges from 100 to 2000. Averaged
over 200 instances.

                              ANN                     INN                  gsat+walk
   N      M   num inst.   fU      H      t        fU      H      t        fU      H      t
 ...    ...     ...       ...    ...    ...       ...    ...    ...       ...    0.000  0.01
  50    218    1000      0.607  0.759  0.04      0.194  0.208  0.05      0.008  0.008  0.02
  75    325     100      0.84   1.3    0.07      0.41   0.44   0.11      0.05   0.05   0.05
 100    430     ...      0.844  1.485  0.09      0.315  0.362  0.13      0.072  0.074  0.09
 ...    538     ...      0.88   1.72   0.11      0.39   0.41   0.18      0.10   0.10   0.13
 ...    ...     ...      0.89   2.07   0.14      0.34   0.4    0.23      0.16   0.17   0.19
 175    753     100      0.98   2.6    0.17      0.51   0.61   0.39      0.27   0.28   0.33
 200    860     ...      1      3.06   0.20      0.6    0.81   0.52      0.32   0.34   0.39
 225    960     ...      0.97   3.15   0.22      0.52   0.67   0.51      0.35   0.37   0.46
 250   1075     100      0.99   3.53   0.25      0.58   0.77   0.65      0.39   0.44   0.53

Table 1: Results for the solvable 3-SAT problems close to $\alpha_c$. fU is the fraction of problems not satisfied
by the algorithm, H is the average number of unsatisfied clauses (6) and t is the average CPU-time used
(given in seconds). The third column (num inst.) is the number of instances in the problem set.

5.5 Discussion

The first point to be made is the dramatic performance improvement in INN as compared
to ANN. This is partly due to the divergent nature of the INN cost function I, leading
to a progressively increased focus on the neurons involved in the relatively few critical
clauses on the verge of becoming unsatisfied. This improves the revision capability, which
is beneficial for the performance. The choice of randomizing to ±1 (which appears
very natural) in cases of balancing infinities in $u_i$ contributes to this effect.

A performance comparison of INN and gsat+walk indicates that the latter appears to
have the upper hand for small N. For larger N however, INN seems to be quite
comparable to gsat+walk.
6 Summary and Outlook



We have presented a heuristic algorithm, INN, for binary satisfiability problems. It is a 
modification of the conventional mean-field based ANN annealing algorithm, and differs 
from this mainly by a replacement of the usual multilinear cost function by one derived 
from an information-theoretical argument. 

This modification is shown empirically to dramatically enhance the performance on a
testbed of random K-SAT problem instances; the resulting performance is for large
problem sizes comparable to that of a good dedicated heuristic, tailored to K-SAT.

An important advantage of the INN approach is its generality. The basic philosophy - 
the balance of information - can be applied to a host of different types of binary as well 
as non-binary problems; work in this direction is in progress. 